A New Paradigm for Quantum Nonlocality 
Bell's theorem [1, 2] states that local hidden variable theories cannot reproduce the correlations predicted by quantum mechanics. It rests on the assumption that Alice and Bob can each choose a measurement from a set containing more than one element. We establish a new paradigm that departs from Bell's by assuming that Alice and Bob have no choice: the measurements they will make are fixed from the start. We also include a step of quantum computation in our model. To the best of our knowledge, we are the first to connect quantum computation and nonlocality, the two faces of entanglement.
FIG. 1. A New Paradigm



The charm and beauty of quantum mechanics rest to a great extent on its counterintuitiveness. One of its most counterintuitive results is Bell's theorem [2], which implies that quantum mechanics violates either locality or counterfactual definiteness. A simple, informal restatement of Bell's theorem is that local hidden variable theories cannot reproduce all of the predictions of quantum mechanics.

We consider the power of entanglement, and the simulation of entanglement, from another perspective. In Bell's theorem, the measurements are versatile and can be chosen with respect to various bases. Alice may choose to measure her state in a basis at an angle $\alpha$ relative to the standard basis, while Bob may choose another basis, at an angle $\beta$ relative to the standard basis. What happens if we fix the measurement? Is it then possible to simulate "quantum mechanics" with local hidden variables?



Next we make our model more formal and more concrete. The model is roughly illustrated by Figure 1.

In Figure 1, $|\psi\rangle$ represents an entangled state of $2Q$ qubits ($Q < n$), where the first $Q$ qubits are owned by Alice and the second $Q$ qubits belong to Bob. We may also add ancilla qubits so that both Alice and Bob have $n$ qubits. The initial state is $\phi_0$. Alice applies a unitary operation $U$ and Bob applies a unitary operation $V$, so the state becomes $\phi_1 = (U \otimes V)\phi_0$. $U$ and $V$ are fixed for Alice and Bob. Their role can be seen as a step of quantum computation that makes the quantum correlation harder to simulate classically.

Then Alice and Bob both apply the measurement $M$, which is fixed from the start to be with respect to the standard basis. In the end, they output a correlation $(X, Y)$ according to the results of the measurement, where $X$ and $Y$ are random variables taking values in $\{0,1\}^n$. Here $\{0,1\}^n$ denotes the set of the $2^n$ binary strings of length $n$. For instance, if $n = 2$, then $\{0,1\}^2$ contains $2^2 = 4$ binary strings: $\{00, 01, 10, 11\}$. If $n = 1$, then both $X$ and $Y$ have two possible outcomes, 0 and 1: $X$ equals 0 with some probability $p$ ($0 < p < 1$) and 1 with the remaining probability $1 - p$, while $Y$ equals 0 with probability $q$ ($0 < q < 1$) and 1 with probability $1 - q$. $X$ and $Y$ are correlated in the sense that they are not independent. Suppose the distribution of $(X, Y)$ is $P_r$; we want to approximately simulate $P_r$ using local hidden variables. Note that $P_r$ can be treated as a matrix, namely



$$(P_r)_{xy} = \mathrm{Prob}\{X = x,\, Y = y\}, \tag{1}$$

for all $x, y \in \{0,1\}^n$.
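
As a minimal illustration of how the outcome matrix $P_r$ arises, the following Python sketch computes it for the smallest case $n = 1$. The helper names (`kron_apply`, `measurement_distribution`) and the choice of the Hadamard matrix for both $U$ and $V$ are our own assumptions for illustration; they are not prescribed by the model.

```python
import math

def kron_apply(U, V, phi):
    """Return (U tensor V) phi, where phi is indexed as phi[a * N + b]
    (a = Alice's basis index, b = Bob's basis index)."""
    N = len(U)
    out = [0j] * (N * N)
    for x in range(N):
        for y in range(N):
            out[x * N + y] = sum(U[x][a] * V[y][b] * phi[a * N + b]
                                 for a in range(N) for b in range(N))
    return out

def measurement_distribution(phi1, N):
    """(P_r)_{xy} = |amplitude of |x>|y>|^2 for a standard-basis measurement."""
    return [[abs(phi1[x * N + y]) ** 2 for y in range(N)] for x in range(N)]

# n = 1: Alice and Bob hold one qubit each and share a Bell state.
N = 2
s = 1 / math.sqrt(2)
phi0 = [s, 0, 0, s]          # (|00> + |11>) / sqrt(2)
H = [[s, s], [s, -s]]        # Hadamard matrix, used here as both U and V (assumption)
P_r = measurement_distribution(kron_apply(H, H, phi0), N)
```

With this particular choice the Bell state is mapped to itself, so the two outcomes remain perfectly correlated; other fixed choices of $U$ and $V$ give other matrices $P_r$.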

Our paradigm differs from Bell's in two clear ways. First, we add a process of quantum computation: Alice and Bob apply unitary operations to the initial quantum state. In Bell's setting there is no such step, and Alice and Bob measure the quantum state directly. Second, in Bell's model Alice and Bob are free to choose their measurements, whereas here the measurements are prescribed from the start. As we will see, the key feature of the new model is that there is no so-called free will and no space-like separation between Alice's and Bob's measurement choices. It is therefore possible to build a local model consistent with quantum correlations. We show, however, that simulating quantum correlations is still very hard, and that there could be some pathology in such a model.

For the classical simulation, Alice and Bob initially share some random bits (shared randomness, public coins, or local hidden variables). We denote the shared random variable by $Z$. Alice and Bob also have private random bits, denoted $r_A$ and $r_B$ respectively. They use $Z$, $r_A$ and $r_B$ to generate a correlation $(X', Y')$ such that $X' = f_A(Z, r_A)$ and $Y' = f_B(Z, r_B)$, where $X'$ and $Y'$ are random variables taking values in $\{0,1\}^n$. Let $P_c$ be the probability distribution of $(X', Y')$. We want $P_c$ to be close to $P_r$: if they are close enough, then we have succeeded in simulating quantum correlations using local hidden variables.
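
The classical side of the model can be sketched in a few lines of Python. The function below computes the exact distribution of $(X', Y')$ for given response functions $f_A$ and $f_B$. The function name `classical_distribution`, the dictionary representation, and the assumption that the private randomness is uniform are ours, chosen only to keep the example small.

```python
from itertools import product

def classical_distribution(p_z, f_A, f_B, r_A_values, r_B_values, outcomes):
    """Exact P_c for X' = f_A(Z, r_A), Y' = f_B(Z, r_B).

    p_z maps each shared value z to Prob{Z = z}. Private randomness is taken
    to be uniform over r_A_values and r_B_values (an assumption of this sketch).
    """
    P_c = {(x, y): 0.0 for x, y in product(outcomes, repeat=2)}
    w = 1.0 / (len(r_A_values) * len(r_B_values))
    for z, pz in p_z.items():
        for ra, rb in product(r_A_values, r_B_values):
            P_c[(f_A(z, ra), f_B(z, rb))] += pz * w
    return P_c

# One shared fair bit, private randomness ignored: both parties output Z.
P_c = classical_distribution(
    p_z={0: 0.5, 1: 0.5},
    f_A=lambda z, r: z,
    f_B=lambda z, r: z,
    r_A_values=[0],
    r_B_values=[0],
    outcomes=[0, 1],
)
```

In this toy case a single shared bit already yields perfectly correlated outputs; the theorems in this paper concern how many shared bits are needed when the target distribution is harder.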

Now the question is how to measure the distance between the two probability distributions $P_c$ and $P_r$. There are two usual ways. One is to allow some multiplicative error:

$$(1 - \beta)(P_r)_{xy} \le (P_c)_{xy} \le (1 + \beta)(P_r)_{xy}, \tag{2}$$

for all $x, y \in \{0,1\}^n$, where $\beta$ is some constant satisfying $0 < \beta < 1$. In this case, we say that $P_c$ and $P_r$ are $\beta$-close. The other is to allow some additive error, which is used more frequently in practice. For two probability distributions $D_1$ and $D_2$ over a space $S$ (in our case, $S = \{0,1\}^n \times \{0,1\}^n$), the variational distance between them is

$$\|D_1 - D_2\|_1 = \sum_{s \in S} |D_1(s) - D_2(s)|. \tag{3}$$

If $P_c$ and $P_r$ satisfy $\|P_c - P_r\|_1 \le \epsilon$ for some constant $\epsilon > 0$, then we say that the variational distance between $P_c$ and $P_r$ is at most $\epsilon$.
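
The two notions of closeness can be checked numerically as follows. This is a small sketch with distributions stored as nested lists; the function names `is_beta_close` and `variational_distance` and the sample matrices are our own, for illustration only.

```python
def is_beta_close(P_c, P_r, beta):
    """Multiplicative-error test: (1 - beta) P_r <= P_c <= (1 + beta) P_r entrywise."""
    return all((1 - beta) * r <= c <= (1 + beta) * r
               for row_c, row_r in zip(P_c, P_r)
               for c, r in zip(row_c, row_r))

def variational_distance(D1, D2):
    """Additive-error measure: sum over all entries of |D1(s) - D2(s)|."""
    return sum(abs(a - b)
               for row1, row2 in zip(D1, D2)
               for a, b in zip(row1, row2))

P_r = [[0.5, 0.0], [0.0, 0.5]]
P_c = [[0.45, 0.05], [0.05, 0.45]]

close_multiplicative = is_beta_close(P_c, P_r, 0.5)
distance = variational_distance(P_c, P_r)
```

Note that the multiplicative criterion is far stricter wherever $P_r$ has zero entries: any $P_c$ that puts positive mass on such an entry fails the test for every $\beta$, even if its variational distance from $P_r$ is small.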

For the multiplicative-error case, we have the following theorem.

Theorem 1 There is an entangled state $|\psi\rangle$ with $Q = 1$ and some $U$ and $V$, such that for any given constant $0 < \beta < 1$, at least $\log_2(n)$ shared random bits are needed to produce $P_c$, where $P_c$ and $P_r$ are $\beta$-close.

For the additive-error case, we show the following result.

Theorem 2 There is an entangled state with $Q = O(\log_2(n))$, and some $U$ and $V$, such that for any given constant $\epsilon > 0$, at least $\Omega(\sqrt{n})$ shared random bits are needed to generate $P_c$, where the variational distance between $P_c$ and $P_r$ is at most $\epsilon$.

In Theorem 2, the $O(\log_2(n))$ notation ($O$ is called big-Oh notation) means that $Q$ is bounded above by $\log_2(n)$ up to some constant, and the $\Omega(\sqrt{n})$ notation ($\Omega$ is called big-Omega notation) means that the number of shared random bits is bounded below by $\sqrt{n}$ up to some constant. In other words, $Q$ is at most $c_1 \log_2(n)$ for some constant $c_1 > 0$, and the number of shared random bits needed is at least $c_2\sqrt{n}$, where $c_2 > 0$ is some constant.

Theorem 1 is a "$\log_2(n)$ vs. 1" (super-exponential) separation and Theorem 2 is an "$\Omega(\sqrt{n})$ vs. $O(\log_2(n))$" (exponential) separation. Hence, we have shown that quantum entanglement is still much more powerful than classical shared randomness, even when we fix the measurement and consider approximate simulation. This result expands our knowledge of the power of entanglement and helps us better understand the principles of quantum entanglement.

Theorem 1 says that a finite amount of local hidden variables cannot account for quantum correlations, even when the measurements Alice and Bob will make are prescribed from the start. For any finite amount of local hidden variables, there are always quantum correlations it cannot explain. This is because $n$ can be arbitrarily large while $Q = 1$ is fixed.

Theorem 2 is very interesting and subtle: local hidden variables cannot account for quantum correlations even when the measurements Alice and Bob will make are prescribed from the start. This is delicate because it is not true for any finite set of measurements. That is, Alice and Bob share some number $n$ of pairs, and for any finite $n$, local hidden variables can reproduce the results of their measurements. Local hidden variables fail not at any particular value of $n$ but in the limit as $n$ tends to infinity, because the number of local hidden variables needed to account for the results grows exponentially. One might say that the problem is the absence of a thermodynamic limit: the number of hidden variables per pair shared by Alice and Bob is not an intensive quantity. Our argument is not as convincing as Bell's, but it goes beyond Bell's theorem in the sense that it shows that if hidden variables explain quantum correlations, then there is some pathology in the explanation.

We give a short overview of the proofs of Theorems 1 and 2. Theorem 1 is relatively easy to prove. We set the initial state to be the well-known Bell state and compute $P_r$. By choosing $U$ and $V$ properly, we construct a $P_r$ that is very hard to simulate using local hidden variables, which completes the proof. The proof of Theorem 2 is much more complicated. We first define a specific distribution $P_u$. Based on $P_u$, we set the initial state, $U$ and $V$ so that $P_r$, the distribution after measurement, is very close to $P_u$. Hence, if we want to guarantee that $P_c$, the distribution generated classically, is close to $P_r$, then $P_c$ must also be close to $P_u$. We then show that the classical simulation of $P_u$ is hard, which completes the proof.

Although the mathematics in the proof of Theorem 2 is very complicated, the intuitive idea is simple: to approximately generate $2n$ bits of correlated classical information (here, $P_u$), roughly $\log_2(n)$ shared qubits are enough (here, the initial quantum state from which we get $P_r$), but classically we need at least (roughly) $\sqrt{n}$ correlated bits (here, the shared random bits from which we generate $P_c$). This exponential separation is inherent. Imagine the following situation. Suppose that Alice and Bob share some number $n$ of pairs. If the initial state shared by Alice and Bob is a product state, then the probability distribution of their joint results is simply the product of the probability distributions of their separate results, and the amount of information needed to specify the joint distribution is automatically proportional to the number $n$ of shared pairs. But if the initial state is an entangled state, as in our case, the number of terms in its Schmidt decomposition can be exponential in $n$, and the amount of information needed is not necessarily proportional to $n$.

Proof of Theorem 1. We set $|\psi\rangle$ to be the Bell state

$$|\psi\rangle = \frac{|00\rangle + |11\rangle}{\sqrt{2}}, \tag{4}$$



where the first qubit is owned by Alice while the second belongs to Bob, so $Q = 1$. We also add some ancilla qubits. The initial state is

$$\phi_0 = |0^{n-1}\rangle\, |\psi\rangle\, |0^{n-1}\rangle. \tag{5}$$



Then Alice applies a unitary operation $U$ and Bob applies a unitary operation $V$, where $U$ and $V$ are unitary matrices of size $2^n \times 2^n$. The state becomes

$$\phi_1 = \frac{1}{\sqrt{2}}\,(u_0 \otimes v_0 + u_1 \otimes v_1), \tag{6}$$



where $u_0$ is the first column of $U$, $v_0$ is the first column of $V$, $u_1$ is the second column of $U$, and $v_1$ is the $(2^{n-1} + 1)$-th column of $V$. Let $N = 2^n$. After the measurement $M$, there are $N \times N$ possibilities, and

$$(P_r)_{xy} = \frac{1}{2}\,|u_0(x)v_0(y) + u_1(x)v_1(y)|^2, \tag{7}$$

for all $x, y \in \{0,1\}^n$.

Next we set the proper $U$ and $V$. Let $\{c_x : x \in \{0,1\}^n\}$ be a set of $N$ distinct real numbers. If we take $\{c_x : x \in \{0,1\}^n\}$ to be proper values [3] satisfying

$$\sum_{1 \le x < y \le N} (c_y - c_x)^2 = \frac{1}{2}, \tag{8}$$

and let $v_0 = \bar{u}_0$ and $v_1 = -\bar{u}_1$, we have

$$P_r = \left[(c_y - c_x)^2\right]_{xy} = \left[\tfrac{1}{2}\,|u_0(x)\bar{u}_0(y) - u_1(x)\bar{u}_1(y)|^2\right]_{xy}. \tag{9}$$

Here, $\bar{v}$ is the conjugate vector of $v$. Therefore, the diagonal entries of such a $P_r$ are 0 and its off-diagonal entries are non-zero.
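
To make the construction concrete, the Python sketch below builds one possible pair of orthonormal columns $u_0$, $u_1$ from a set of distinct reals $c_x$ and checks numerically that the resulting matrix has entries $(c_y - c_x)^2$. The specific recipe (zero-mean $c_x$, and $u_{0,1} = (p \pm i q)/\sqrt{2}$ with $p$ uniform and $q$ proportional to $c$) is our own assumption for illustration; the source defers the actual choice of values to [3].

```python
import math

def build_example(n):
    """Return (c, u0, u1) for N = 2**n, under the assumptions stated above."""
    N = 2 ** n
    # Distinct reals with zero mean (the zero-mean choice is an assumption).
    raw = [x - (N - 1) / 2 for x in range(N)]
    norm = math.sqrt(sum(t * t for t in raw))
    # Rescale so that sum_{x,y} (c_y - c_x)^2 = 2 N sum_x c_x^2 = 1.
    c = [t / (norm * math.sqrt(2 * N)) for t in raw]
    p = [1 / math.sqrt(N)] * N          # uniform unit vector
    q = [t / norm for t in raw]         # unit vector orthogonal to p
    u0 = [(p[x] + 1j * q[x]) / math.sqrt(2) for x in range(N)]
    u1 = [(p[x] - 1j * q[x]) / math.sqrt(2) for x in range(N)]
    return c, u0, u1

def P_r_from_columns(u0, u1):
    """Entries (1/2)|u0(x) conj(u0(y)) - u1(x) conj(u1(y))|^2."""
    N = len(u0)
    return [[0.5 * abs(u0[x] * u0[y].conjugate()
                       - u1[x] * u1[y].conjugate()) ** 2
             for y in range(N)] for x in range(N)]

def inner(a, b):
    """Standard complex inner product <a, b>."""
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

c, u0, u1 = build_example(3)
P_r = P_r_from_columns(u0, u1)
```

Under these assumptions `u0` and `u1` are orthonormal, so they can be completed to a unitary $U$, and the zero diagonal of $P_r$ is visible directly from the factor $c_y - c_x$.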



Suppose we want to use shared random bits to generate a $P_c$ that is $\beta$-close to $P_r$. Any such $P_c$ has to satisfy

$$(1 - \beta)(P_r)_{xy} \le (P_c)_{xy} \le (1 + \beta)(P_r)_{xy} \tag{10}$$

for all $x, y \in \{0,1\}^n$. Thus, $P_c$ also has the property that its diagonal entries are 0 and its off-diagonal entries are non-zero. Suppose that in the beginning Alice and Bob share a random variable $Z$, whose sample space is $S$. The size of $S$ is bounded below by $\log_2(N)$ [3]. Consequently,

$$\log_2(|S|) \ge \log_2(n), \tag{11}$$

implying that we need at least $\log_2(n)$ shared random bits to approximately simulate $P_r$. This completes the proof of Theorem 1. □
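
The bound on $|S|$ is taken from [3]. The Python sketch below is our own illustration of a standard counting argument that is consistent with it, and is not claimed to be the proof used there. Conditional on each shared value $z$, the outputs form a product distribution with some supports $A_z$ (Alice) and $B_z$ (Bob). A zero diagonal forces $A_z \cap B_z = \emptyset$, and non-zero off-diagonal entries force every ordered pair $x \ne y$ to lie in some $A_z \times B_z$. If two strings had the same membership pattern across the sets $A_z$, the pair could never be covered, so $N \le 2^{|S|}$. All function names are hypothetical.

```python
def is_valid_cover(supports, N):
    """supports: list of (A_z, B_z) sets of outcomes in range(N).
    Valid iff each pair is disjoint (zero diagonal) and every ordered
    pair x != y lies in some A_z x B_z (non-zero off-diagonal)."""
    if any(A & B for A, B in supports):
        return False
    return all(any(x in A and y in B for A, B in supports)
               for x in range(N) for y in range(N) if x != y)

def bit_cover(n):
    """A valid cover with 2n rectangles: split on the value of each bit."""
    N = 2 ** n
    cover = []
    for j in range(n):
        ones = {x for x in range(N) if (x >> j) & 1}
        zeros = set(range(N)) - ones
        cover += [(ones, zeros), (zeros, ones)]
    return cover

def signatures(supports, N):
    """Membership pattern of each outcome x across the sets A_z."""
    return [tuple(x in A for A, _ in supports) for x in range(N)]

n = 3
N = 2 ** n
cover = bit_cover(n)
```

Here `bit_cover` shows that on the order of $\log_2(N)$ shared values suffice to match the zero pattern, while the distinct-signature requirement shows that substantially fewer cannot.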
Proof of Theorem 2. First, we define a probability distribution $P_u$ over the space $\{0,1\}^n \times \{0,1\}^n$. Suppose $x \in \{0,1\}^n$, so $x$ is an $n$-bit binary string, and let $|x|$ denote the cardinality of $x$, namely the number of 1's in $x$. For example, if $n = 8$ and $x = 00111001$, then the cardinality of $x$ is $|x| = 4$. Also suppose $y \in \{0,1\}^n$. If $x_i \wedge y_i = 0$ for all $i \in \{1, 2, \ldots, n\}$, then we say that $x$ and $y$ are disjoint. For instance, if $n = 4$, $x = 1100$ and $y = 0011$, then $x$ and $y$ are disjoint. Without loss of generality, we assume that $n$ is a square number, such as 1, 4, 9, 16 and so on, and we let $N_1 = \binom{n - \sqrt{n}}{\sqrt{n}}$ and $N_2 = \binom{n}{\sqrt{n}}$. Here the notation $\binom{n}{k}$ means the binomial coefficient, namely the number of ways of selecting $k$ things out of a group of $n$ elements. We define a probability distribution $P_u$ (in the form of a matrix) to be



$$(P_u)_{xy} = \frac{1}{N_1 N_2} \tag{12}$$

if $|x| = |y| = \sqrt{n}$ and $x$ and $y$ are disjoint; otherwise $(P_u)_{xy} = 0$. In words, $P_u$ is the uniform distribution over all disjoint pairs $x$ and $y$ whose cardinalities are both $\sqrt{n}$. Another matrix $M_u$ can be defined based on $P_u$:

$$(M_u)_{xy} = \sqrt{(P_u)_{xy}}, \tag{13}$$



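for all $x, y \in \{0,1\}^n$. By spectral decomposition, there exists a unitary matrix $U_1$ (which can be taken to be real, since $M_u$ is real and symmetric) such that

$$M_u = U_1 D_u U_1^{\dagger}, \tag{14}$$

where $D_u$ is a diagonal matrix consisting of the spectrum $\lambda_0, \lambda_1, \ldots, \lambda_{2^n - 1}$ ($\lambda_0 \ge \lambda_1 \ge \cdots \ge \lambda_{2^n - 1}$).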
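
For a small case the distribution $P_u$ can be enumerated directly. The Python sketch below does so for $n = 4$ and checks the size of the support against the product of binomial coefficients. It also forms $M_u$ under the assumption that $M_u$ is the entrywise square root of $P_u$; in that case the squared entries of $M_u$, and hence its squared eigenvalues, sum to 1. The function name `build_P_u` and the dictionary representation are ours.

```python
import math
from itertools import product

def build_P_u(n):
    """Uniform distribution over disjoint pairs (x, y) with |x| = |y| = sqrt(n)."""
    k = math.isqrt(n)
    assert k * k == n, "n is assumed to be a perfect square"
    strings = list(product((0, 1), repeat=n))
    support = [(x, y) for x in strings for y in strings
               if sum(x) == k and sum(y) == k
               and all((a & b) == 0 for a, b in zip(x, y))]
    weight = 1.0 / len(support)
    return strings, {pair: weight for pair in support}

strings, P_u = build_P_u(4)

# Assumption: M_u is the entrywise square root of P_u (entries off the support are 0).
M_u = {pair: math.sqrt(p) for pair, p in P_u.items()}

# Squared Frobenius norm of M_u, which equals the sum of its squared eigenvalues.
frobenius_sq = sum(m * m for m in M_u.values())
```

Because the squared eigenvalues sum to 1 under this assumption, a truncation that keeps most of that mass yields a normalized state whose measurement statistics stay close to $P_u$.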
Define $K$ to be

$$K = \min\Big\{k : \sum_{i \le k} \lambda_i^2 \ge 1 - \frac{\epsilon^2}{4}\Big\}, \tag{15}$$

and set $Q$ to be $\lceil \log_2(K + 1) \rceil$, $U$ to be $U_1$, and $V$ to be $U_1$. Then in our model,



$$|\psi\rangle = \frac{1}{N'} \sum_{i=0}^{K} \lambda_i\, |i, i\rangle, \tag{16}$$

$$\phi_0 = \frac{1}{N'} \sum_{i=0}^{K} \lambda_i\, |00\cdots0\,i,\; 00\cdots0\,i\rangle, \tag{17}$$

and

$$\phi_1 = (U_1 \otimes U_1)\phi_0, \tag{18}$$

where $N'$ is a normalization factor and $\sqrt{1 - \epsilon^2/4} \le N' \le 1$. It is not hard to verify that



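$$(P_r)_{xy} = \frac{1}{N'^2}\,\big|(U_1 D_r U_1^{\dagger})_{xy}\big|^2, \tag{19}$$

where $D_r = \mathrm{diag}\{\lambda_0, \lambda_1, \ldots, \lambda_K, 0, 0, \ldots, 0\}$. The variational distance between $P_u$ and $P_r$ is at most $\epsilon$, and $Q$ is at most (roughly) $\log_2(n)$, namely $Q = O(\log_2(n))$ [3].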

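Since $\|P_u - P_r\|_1 \le \epsilon$, if we want to ensure $\|P_c - P_r\|_1 \le \epsilon$, we have to guarantee $\|P_c - P_u\|_1 \le 2\epsilon$; otherwise, by the triangle inequality,

$$\|P_c - P_r\|_1 \ge \|P_c - P_u\|_1 - \|P_u - P_r\|_1 > 2\epsilon - \epsilon = \epsilon, \tag{20}$$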



which does not satisfy our aim. Next we determine the amount of classical local hidden variables needed to produce a $P_c$ with $\|P_c - P_u\|_1 \le 2\epsilon$. Suppose that in the beginning Alice and Bob share a random variable $Z$, whose sample space is $S$. Conditional on $Z$, $X'$ and $Y'$ are independent. We denote the distribution of $(X', Y')$ under $Z = z$ by $D_z$, which is a product distribution. Thus,



$$P_c = \sum_{z \in S} \mathrm{Prob}\{Z = z\}\, D_z. \tag{21}$$

The size of $S$ is lower bounded by $2^{\Omega(\sqrt{n})}$ [3]. Thus, the number of shared random bits needed is at least $\log_2(|S|) = \Omega(\sqrt{n})$. This completes the proof of Theorem 2. □

Related Work: The idea of a great many papers ([4-8], etc.) is that local hidden variables augmented by communication could reproduce the results of quantum entanglement. Quantum entanglement also has plenty of applications in areas such as quantum teleportation [9], superdense coding [10] and quantum cryptography [11]. In our previous work [12], it is shown that at least $n$ bits of local hidden variables are needed to exactly simulate the correlation generated from a 2-qubit Bell state. This is an "$n$ vs. 1" (super-exponential) separation, and is very interesting theoretically. But it has a fatal shortcoming: it cannot be experimentally validated, since there is simply no effective way to tell whether two probability distributions are exactly the same. As a result, in this paper we focus on approximate simulation, which is intriguing in theory and at the same time feasible in experiment. [13] is in a game theoretic setting and the players there have some free will. In addition, the result in [13] deals with exact simulation instead of approximate simulation, and is wrong [14]. [15] is in an information theoretic setting, and players there have free will. Moreover, [15] only discusses simulating some distribution and does not discuss simulating the distribution generated from quantum measurement.

Concluding Remarks: In a nutshell, we have established a new paradigm for quantum nonlocality by restricting the set of measurements and using quantum computation. We also obtained two separation results under the new model to show the hardness of classically explaining quantum mechanics using local hidden variables. Our work is seminal in at least two aspects. First, discussions of Bell's theorem are always complicated by counterfactual argumentation. In our paper there are no counterfactuals, so we have a fresh approach to, and a new look at, Bell's theorem. Second, quantum information theory relies on entanglement in two quite different contexts: as a resource for quantum computation and as a source of nonlocal correlations among different parties. It is strange and not understood that nonlocality is crucially linked with entanglement in the second context but not in the first. Quantum computation and nonlocality are two faces of entanglement that we do not usually connect, and in this paper we create a direct interface between them. To the best of our knowledge, we are the first to do so.

We are grateful to the anonymous referees for their insightful and helpful comments.